

/*


	This program takes the data extract downloaded from IPUMS-CPS 
	(cps.ipums.org) and formats the data and labels variables to make
	it easier to use. 
		
	Original author: Sheela Kennedy (last update May 31, 2020)
	Modified by: Wonjeong Jeong

	This program reads in the IPUMS-CPS extract, formats the CPSID variables, and identifies cohabiting couples.

	IPUMS extracts include observations for each person-month, so researchers may wish to prepare 
	individual-level analytical variables at this stage. Household members other than the householder 
	and their spouse or partner will be dropped when longitudinal files are produced. 
	Any information needed about these individuals in analysis will need to be identified 
	and attached to the records of the householder and their partner before linking occurs.


*/

/*

	IPUMS-CPS extract must include the following variables:
	Longitudinal linking variables: 
		Linking keys: CPSID and CPSIDP
		Individual longitudinal weights (to identify linkable samples): LNKFW1MWT, LNKFWMIS45WT, LNKFW1YWT 
		(or those weights appropriate to your analysis)
	Sample information: MISH, MONTH, YEAR
	Partnership variables:  RELATE, MARST, SPLOC
	Spouse's attached characteristics: CPSIDP_SP, SEX_SP, RELATE_SP 
	Demographic characteristics: AGE, SEX, RACE
	Geography: STATEFIP (optional, used in the racebridge program)
	Weight variable: HWTFINL, the household weight for the basic monthly surveys
	Analytic variables: chosen by user

	The program assumes that the user has downloaded the extract as a Stata (.dta) file.
*/

clear
set more off

* User must fill in the paths and file names for the following local macros *
local indir "../replication-package/data"
local ipumsdta "~/Dropbox/DATA/US/CPS/cps_00162.dta"
local setup "setup/01_setup.dta"


cd "`indir'"

use "`ipumsdta'", clear
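
* Optional check (a sketch, not part of the original program): report any
* variables from the required list above that are missing from the extract.
* Assumes the extract uses lowercase variable names, as the rest of this
* program does; edit the list to match the weights and analytic variables
* in your own extract. Missing variables are reported but do not stop the run.
foreach v in cpsid cpsidp cpsidp_sp mish month year relate marst sploc age sex race hwtfinl {
	capture confirm variable `v'
	if _rc {
		display as error "Note: variable `v' not found in extract"
	}
}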

* reformat identifiers to display all digits
format cpsid* %14.0f
order cpsid cpsidp cpsidp_sp
sort cpsid cpsidp cpsidp_sp

list cpsid cpsidp cpsidp_sp in 1/10


* Identify cohabiting couples

	* identify householders who are unmarried (here: any marst other than 1, married spouse present), and households w/these heads 
	gen unmarrhd = (relate==101) & marst!=1 
	bys year month cpsid: egen unmarrhdH = max(unmarrhd) 

	* identify unmarried partners, and households w/unmarried partners, requiring that head is also unmarried 
		
	gen unmarrp = inlist(relate, 1114, 1116, 1117) & marst!=1 
	bys year month cpsid: egen unmarrpH = max(unmarrp)
	replace unmarrpH = 0 if unmarrpH ==1 & unmarrhdH==0 
	tab unmarrpH 
    
	* identify cohabitors: unmarried heads and unmarried partners in households flagged above
	gen cohab = unmarrpH==1 & (unmarrp==1|unmarrhd==1) 
	label variable cohab "Cohabiting in unmarried partnership"
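
	* Optional sanity check (a sketch, not part of the original program):
	* count cohabitors per household-month. Every flagged household should
	* show at least two (the unmarried head and an unmarried partner).
	* ncohab is a hypothetical scratch variable and is dropped right away.
	bys year month cpsid: egen ncohab = total(cohab)
	tab ncohab if unmarrpH==1
	drop ncohab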
	
	* drop intermediate flags (the household-level indicator unmarrpH is kept)
	drop unmarrhd unmarrhdH unmarrp

compress

save "`setup'", replace


